weather prediction
Attention-Enhanced Convolutional Autoencoder and Structured Delay Embeddings for Weather Prediction
Hedayat, Amirpasha, Duraisamy, Karthik
Weather prediction is a quintessential problem involving the forecasting of a complex, nonlinear, and chaotic high-dimensional dynamical system. This work introduces an efficient reduced-order modeling (ROM) framework for short-range weather prediction and investigates fundamental questions in dimensionality reduction and reduced order modeling of such systems. Unlike recent AI-driven models, which require extensive computational resources, our framework prioritizes efficiency while achieving reasonable accuracy. Specifically, a ResNet-based convolutional autoencoder augmented by block attention modules is developed to reduce the dimensionality of high-dimensional weather data. Subsequently, a linear operator is learned in the time-delayed embedding of the latent space to efficiently capture the dynamics. Using the ERA5 reanalysis dataset, we demonstrate that this framework performs well in-distribution as evidenced by effectively predicting weather patterns within training data periods. We also identify important limitations in generalizing to future states, particularly in maintaining prediction accuracy beyond the training window. Our analysis reveals that weather systems exhibit strong temporal correlations that can be effectively captured through linear operations in an appropriately constructed embedding space, and that projection error rather than inference error is the main bottleneck. These findings shed light on some key challenges in reduced-order modeling of chaotic systems and point toward opportunities for hybrid approaches that combine efficient reduced-order models as baselines with more sophisticated AI architectures, particularly for applications in long-term climate modeling where computational efficiency is paramount.
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- North America > United States (0.14)
- Europe > Switzerland > Zürich > Zürich (0.05)
- Indian Ocean > Bay of Bengal (0.04)
- North America > United States (0.46)
- Europe > Switzerland > Zürich > Zürich (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > Middle East > Saudi Arabia > Riyadh Province > Riyadh (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Adaptive Spatio-Temporal Graphs with Self-Supervised Pretraining for Multi-Horizon Weather Forecasting
Accurate and robust weather forecasting remains a fundamental challenge due to the inherent spatio-temporal complexity of atmospheric systems. In this paper, we propose a novel self-supervised learning framework that leverages spatio-temporal structures to improve multi-variable weather prediction. The model integrates a graph neural network (GNN) for spatial reasoning, a self-supervised pretraining scheme for representation learning, and a spatio-temporal adaptation mechanism to enhance generalization across varying forecasting horizons. Extensive experiments on both ERA5 and MERRA-2 reanalysis datasets demonstrate that our approach achieves superior performance compared to traditional numerical weather prediction (NWP) models and recent deep learning methods. Quantitative evaluations and visual analyses in Beijing and Shanghai confirm the model's capability to capture fine-grained meteorological patterns. The proposed framework provides a scalable and label-efficient solution for future data-driven weather forecasting systems.
- Asia > China > Shanghai > Shanghai (0.26)
- Asia > China > Beijing > Beijing (0.26)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- (7 more...)
MoWE : A Mixture of Weather Experts
Chakraborty, Dibyajyoti, Maulik, Romit, Harrington, Peter, Foster, Dallas, Nabian, Mohammad Amin, Choudhry, Sanjay
Data-driven weather models have recently achieved state-of-the-art performance, yet progress has plateaued in recent years. This paper introduces a Mixture of Experts (MoWE) approach as a novel paradigm to overcome these limitations, not by creating a new forecaster, but by optimally combining the outputs of existing models. The MoWE model is trained with significantly lower computational resources than the individual experts. Our model employs a Vision Transformer-based gating network that dynamically learns to weight the contributions of multiple "expert" models at each grid point, conditioned on forecast lead time. This approach creates a synthesized deterministic forecast that is more accurate than any individual component in terms of Root Mean Squared Error (RMSE). Our results demonstrate the effectiveness of this method, achieving up to a 10% lower RMSE than the best-performing AI weather model on a 2-day forecast horizon, significantly outperforming individual experts as well as a simple average across experts. This work presents a computationally efficient and scalable strategy to push the state of the art in data-driven weather prediction by making the most out of leading high-quality forecast models.
- North America > United States > Pennsylvania (0.04)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- North America > Mexico > Gulf of Mexico (0.04)
- Asia (0.04)
Breaking the Statistical Similarity Trap in Extreme Convection Detection
Current evaluation metrics for deep learning weather models create a "Statistical Similarity Trap", rewarding blurry predictions while missing rare, high-impact events. We provide quantitative evidence of this trap, showing sophisticated baselines achieve 97.9% correlation yet 0.00 CSI for dangerous convection detection. We introduce DART (Dual Architecture for Regression Tasks), a framework addressing the challenge of transforming coarse atmospheric forecasts into high-resolution satellite brightness temperature fields optimized for extreme convection detection (below 220 K). DART employs dual-decoder architecture with explicit background/extreme decomposition, physically motivated oversampling, and task-specific loss functions. We present four key findings: (1) empirical validation of the Statistical Similarity Trap across multiple sophisticated baselines; (2) the "IVT Paradox", removing Integrated Water Vapor Transport, widely regarded as essential for atmospheric river analysis, improves extreme convection detection by 270%; (3) architectural necessity demonstrated through operational flexibility (DART achieves CSI = 0.273 with bias = 2.52 vs. 6.72 for baselines at equivalent CSI), and (4) real-world validation with the August 2023 Chittagong flooding disaster as a case study. To our knowledge, this is the first work to systematically address this hybrid conversion-segmentation-downscaling task, with no direct prior benchmarks identified in existing literature. Our validation against diverse statistical and deep learning baselines sufficiently demonstrates DART's specialized design. The framework enables precise operational calibration through beta-tuning, trains in under 10 minutes on standard hardware, and integrates seamlessly with existing meteorological workflows, demonstrating a pathway toward trustworthy AI for extreme weather preparedness.
- Indian Ocean > Bay of Bengal (0.05)
- Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
- North America > United States > Pennsylvania (0.04)
- (10 more...)
- Information Technology (0.67)
- Government (0.46)
- Energy (0.46)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Huracan: A skillful end-to-end data-driven system for ensemble data assimilation and weather prediction
Ni, Zekun, Weyn, Jonathan, Zhang, Hang, Xiang, Yanfei, Bian, Jiang, Jin, Weixin, Thambiratnam, Kit, Zhang, Qi, Dong, Haiyu, Sun, Hongyu
Over the past few years, machine learning-based data-driven weather prediction has been transforming operational weather forecasting by providing more accurate forecasts while using a mere fraction of computing power compared to traditional numerical weather prediction (NWP). However, those models still rely on initial conditions from NWP, putting an upper limit on their forecast abilities. A few end-to-end systems have since been proposed, but they have yet to match the forecast skill of state-of-the-art NWP competitors. In this work, we propose Huracan, an observation-driven weather forecasting system which combines an ensemble data assimilation model with a forecast model to produce highly accurate forecasts relying only on observations as inputs. Huracan is not only the first to provide ensemble initial conditions and end-to-end ensemble weather forecasts, but also the first end-to-end system to achieve an accuracy comparable with that of ECMWF ENS, the state-of-the-art NWP competitor, despite using a smaller amount of available observation data. Notably, Huracan matches or exceeds the continuous ranked probability score of ECMWF ENS on 75.4% of the variable and lead time combinations. Our work is a major step forward in end-to-end data-driven weather prediction and opens up opportunities for further improving and revolutionizing operational weather forecasting.
- North America > United States (0.33)
- Europe > Finland (0.05)
- Indian Ocean (0.04)
- (2 more...)
- Government > Regional Government > North America Government > United States Government (0.33)
- Government > Space Agency (0.32)
- North America > United States (0.14)
- Europe > Switzerland > Zürich > Zürich (0.05)
- Indian Ocean > Bay of Bengal (0.04)
- North America > United States (0.46)
- Europe > Switzerland > Zürich > Zürich (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > Middle East > Saudi Arabia > Riyadh Province > Riyadh (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Democracy of AI Numerical Weather Models: An Example of Global Forecasting with FourCastNetv2 Made by a University Research Lab Using GPU
Khadir, Iman, Stevenson, Shane, Li, Henry, Krick, Kyle, Burrows, Abram, Hall, David, Posey, Stan, Shen, Samuel S. P.
This paper demonstrates the feasibility of democratizing AI-driven global weather forecasting models among university research groups by leveraging Graphics Processing Units (GPUs) and freely available AI models, such as NVIDIA's FourCastNetv2. FourCastNetv2 is an NVIDIA's advanced neural network for weather prediction and is trained on a 73-channel subset of the European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis v5 (ERA5) dataset at single levels and different pressure levels. Although the training specifications for FourCastNetv2 are not released to the public, the training documentation of the model's first generation, FourCastNet, is available to all users. The training had 64 A100 GPUs and took 16 hours to complete. Although NVIDIA's models offer significant reductions in both time and cost compared to traditional Numerical Weather Prediction (NWP), reproducing published forecasting results presents ongoing challenges for resource-constrained university research groups with limited GPU availability. We demonstrate both (i) leveraging FourCastNetv2 to create predictions through the designated application programming interface (API) and (ii) utilizing NVIDIA hardware to train the original FourCastNet model. Further, this paper demonstrates the capabilities and limitations of NVIDIA A100's for resource-limited research groups in universities. We also explore data management, training efficiency, and model validation, highlighting the advantages and challenges of using limited high-performance computing resources. Consequently, this paper and its corresponding GitHub materials may serve as an initial guide for other university research groups and courses related to machine learning, climate science, and data science to develop research and education programs on AI weather forecasting, and hence help democratize the AI NWP in the digital economy.
- North America > United States > California > San Diego County > San Diego (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > North Carolina (0.04)
- (4 more...)
- Research Report (1.00)
- Instructional Material (1.00)
- Information Technology (1.00)
- Education > Educational Setting > Higher Education (0.47)